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Abstract 

Nonlinear time series analysis is an active field of research that studies the structure of complex 
signals in order to derive information of the process that generated those series, for understanding, 
modeling and forecasting purposes. In the last years, some methods mapping time series to network 
representations have been proposed. The purpose is to investigate on the properties of the series 
through graph theoretical tools recently developed in the core of the celebrated complex network 
theory. Among some other methods, the so-called visibility algorithm has received much attention, 
since it has been shown that series correlations are captured by the algorithm and translated in 
the associated graph, opening the possibility of building fruitful connections between time series 
analysis, nonlinear dynamics, and graph theory. Here we use the horizontal visibility alyorithm 
to characterize and distinguish between correlated stochastic, uncorrelated and chaotic processes. 
We show that in every case the series maps into a graph with exponential degree distribution 
P(k) ~ exp(— Xk), where the value of A characterizes the specific process. The frontier between 
chaotic and correlated stochastic processes, A = ln(3/2), can be calculated exactly, and some 
other analytical developments confirm the results provided by extensive numerical simulations and 
(short) experimental time series. 
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I. INTRODUCTION 

Concrete hot topics in nonlinear time series analysis [1| include the characterization of 
correlated stochastic processes and chaotic phenomena in a plethora of different situations 

n n 

including long-range correlations in earthquake statistics [2|, climate records |3|, noncoding 
DNA sequences jj], stock market {^j], urban growth dynamics js], or physiological series (3, S| 
to cite but a few, and chaotic processes [l[ BJl4|. 



Both stochastic and chaotic processes share many features, and the discrimination be- 
tween them is indeed very subtle. The relevance of this problem is to determine whether the 
source of unpredictability (production of entropy) has its origin in a chaotic deterministic 
or stochastic dynamical system, a fundamental issue for modeling and forecasting purposes. 
Essentially, the majority of methods that have been introduced so far rely on two 

major differences between chaotic and stochastic dynamics. The first difference is that 
chaotic systems have a finite dimensional attractor, whereas stochastic processes arise from 
an infinite-dimensional one. Being able to reconstruct the attractor is thus a clear evidence 
showing that the time series has been generated by a deterministic system. The devel- 
opment of sophisticated embedding techniques jl] for attractor reconstruction is the most 
representative step forward in this direction. The second difference is that deterministic sys- 
tems evidence, as opposed to random ones, short-time prediction: the time evolution of two 
nearby states will diverge exponentially fast for chaotic ones (finite and positive Lyapunov 
exponents) while in the case of a stochastic process such separation is randomly distributed. 
Whereas some algorithms relying on the preceding concepts are nowadays available, the great 
majority of them are purely phenomenological and often complicated to perform, computa- 
tionally speaking. These drawbacks provide the motivation for a search for new methods 
that can directly distinguish, in a reliable way, stochastic from chaotic time series. This 
is, for instance, the philosophy behind a recent work by Rosso and co-workers [28], where 
the authors present a 2D diagram (the so-called entropy-complexity plane) that relates two 
information-theoretical functionals of the time series (entropy and complexity), and com- 
pute numerically the coordinates of several chaotic and stochastic series in this plane. The 
purpose of this paper is to offer a different, conceptually simple and computationally efficient 
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method to distinguish between deterministic and stochastic dynamics. 

The proposed method uses a new approach to time series analysis that has been devel- 



oped in the last years 15|, Il8l-l22|. In a nutshell, time series are mapped into a network 



representation (where the connections between nodes capture the series structure according 
to the mapping criteria) and graph theoretical tools are subsequently employed to character- 
ize the properties of the series. Some methods sharing similar philoso phy include recurrence 
networks, cycle networks, or correlation networks to cite some (see |20| for a comparative 
review). Amongst these mappings, the so-called visibility algorithm [15] has received much 
attention, since it has been shown that series correlations (including periodicity, fractality 
or chaoticity) are captured by the algorithm and translated in the associated visibility graph 
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4l7]. opening the possibility of building bridges between time series analysis, nonlinear 
dynamics, and graph theory. Accordingly, several works applying such algorithm in several 
contexts ranging from geophysics [24( or turbulence [25] to physiology [26] or finance 



have started to appear [z 



Here we address the characterization of chaotic, uncorrelated and correlated stochastic 
processes, as well as the discrimination between them, via the horizontal visibility algorithm. 
We will show that a given series maps into a graph with an exponential degree distribution 
P(k) ~ exp(— AA;), where A < ln(3/2) characterizes a chaotic process whereas A > ln(3/2) 
characterizes a correlated stochastic one. The frontier \ un = ln(3/2) corresponds to the 
uncorrelated situation and can be calculated exactly 16], thus the method is well grounded. 
Some other features are calculated analytically, confirming our numerical results obtained 
through extensive simulations for Gaussian fields with long-range (power-law) and short- 
range (exponential) correlations and a plethora of chaotic maps (Logistic, Henon, time- 
delayed Henon, Lozi, Kaplan- Yorke, a- map, Arnold cat). Experimental (short) series of 
sinus rythm cardiac interbeats -which have been shown to evidence long-range correlations- 
are also analyzed. Moreover, we will also show that the method not only distinguishes but 
also quantifies (by means of the parameter A) the degree of chaoticity or stochasticity of the 
series. The rest of the paper is organized as follows: in section II we recall some properties 
of the method, and in particular we state the theorem that addresses uncorrelated series. In 
section III we study how the results deviate from this theory in the presence of correlations, 
through a systematic analysis of long-range and short-range stochastic processes. Results 
are validated in the case of experimental time series. Similarly, in section IV we address 
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time series generated through chaotic maps. In section V and VI analytical developments 
and heuristic arguments supporting our previous findings are outlined. In section VII we 
comment on the current limitations of the algorithm, and in section VIII we conclude. 



II. HORIZONTAL VISIBILITY ALGORITHM 
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FIG. 1: Graphical illustration of the horizontal visibility algorithm. A time series is represented 
in vertical bars, and in the bottom we plot its associated horizontal visibility graph, according to 
the geometrical criterion encoded in Eq. (prj (see the text). 



The horizontal visibility algorithm has been recently introduced 16] as a map between a 
time series and a graph and it is defined as follows. Let {£i}j=i,...,Ar be a time series of N real 
data. The algorithm assigns each datum of the series to a node in the horizontal visibility 
graph (HVG). Two nodes i and j in the graph are connected if one can draw a horizontal 
line in the time series joining and Xj that does not intersect any intermediate data height 
(see figure [H for a graphical illustration). Hence, i and j are two connected nodes if the 
following geometrical criterion is fulfilled within the time series: 

Xi, Xj > x n , Wn\i<n<j. (1) 

Some properties of the HVG can be found in 16]. Here we recall the main theorem for 

n 

random uncorrelated series, whose proof can also be found in |16| : 

Theorem (uncorrelated series) Let x,i be a bi-infinite sequence of independent and 
identically distributed random variables extracted from a continous probability density f(x). 



The degree distribution of its associated horizontal visibility graph is 

= !(!)" P) 

Note that P(k) can be trivially rewritten as P{k) ~ exp(— \ un k) with A un = ln(3/2). In- 
terestingly enough, this result is independent of the generating probability density f(x), (as 
long as it is a continuous one, independently on whether the support is compact or not). 
This result shows that there is an universal equivalency between uncorrelated processes and 
A = A un . In what follows we will investigate how results deviate from this theoretical result 
when correlations are present. 

III. CORRELATED STOCHASTIC SERIES 
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FIG. 2: Left: Semilog plot of the degree distribution P(k) of a Gaussian correlated series of 
N = 2 18 data with power-law decaying correlations C(t) ~ i~ 7 , for 7 = 1.0 and 7 = 2.0. showing 
an exponential function. P(k) ~ exp(— \k) in both cases, with slope A = 0.59 and A = 0.50 
respectively. For comparison, the shape of P(k) associated to a random uncorrelated series is shown, 
having \ un = ln(3/2) < A, V7. Right: Similar results associated to short-range correlated series 
generated through an Ornstein-Uhlenbeck process with correlation function C{t) ~ exp(— t/r). 

In order to analyze the effect of correlations between the data of the series, we focus on 
two generic and paradigmatic correlated stochastic processes, namely long-range (power-law 
decaying correlations) and Ornstein-Uhlenbeck (short-range exponentially decaying corre- 
lations) processes. We have computed the degree distribution of the HVG associated to 
different long-range and short-range correlated stochastic series (the method for generating 

5 



the associated series is outlined in the next section). In the left panel of Fig. ([2]) we plot 
in a semi-log scale the degree distribution for correlated series with correlation function 
C(t) = t~ 7 for different values of the correlation strength 7 G [10~ 2 — 10 1 ], while in the 
right panel of the same figure, we plot the results for an exponentially decaying correlation 
function C(t) = exp(— t/r). Note that in both cases the degree distribution of the associated 
HVG can be fitted for large k by an exponential function exp(— Xk). The parameter A 
depends on 7 or r and is, in each case, a monotonic function that reaches the asymptotic 
value A = X un = ln(3/2) in the uncorrelated limit 7 — > 00 or r — > 0, respectively. Detailed 
results of this phenomenology can be found in figure [3J and in the the right panel of figure M 
where we plot the functional relation A(7) and A(r). In all cases, the limit is reached from 
above, i.e. A > X un . Interestingly enough, for the power- law correlations the convergence 
is slow, and there is still a noticeable deviation from the uncorrelated case even for weak 
correlations (7 > 4.0), whereas the convergence with r is faster in the case of exponential 
correlations. 



Minimal substraction procedure 

In what follows we explain the method we have used to generate series of correlated Gaussian 
random numbers Xi of zero mean and correlation function (xiXj) = C(\i — j\). The classical 
method for generating such correlated series is the so-called Fourier filtering method (FFM). 
This method proceeds by filtering the Fourier components of an uncorrelated sequence of 
random numbers with a given filter (usually, a power-law function) in order to introduce 
correlations among the variables. However, the method presents the drawback of evidencing 
a finite cut-off in the range where the variables are actually correlated, rendering it useless in 
practical situations. An interesting improvement was introduced some years ago by Makse 
et. al 35| in order to remove such cut-off. This improvement was based on the removal of 



the singularity of the power-law correlation function C (t) ~ t~ 7 at r = and the associated 
aliasing effects by introducing a well defined one C(t) = (1+t 2 ) -7 / 2 and its Fourier transform 
in continuous-time space. Accordingly, cut-off effects were removed and variables present 
the desired correlations in their whole range. 

We use here an alternative modification of the FFM that also removes undesired cut-off 
effects for generic correlation functions and takes in consideration the discrete nature of the 
series. Our modification is based on the fact that not every function C (t) can be considered 
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to be the correlation function of a Gaussian field, since some mathematical requirements 
need to be fulfilled, namely that the quadratic form Ylij x iC{\i ~ j\) x j be positive definite. 
For instance, let us suppose that we want to represent data with a correlation function 
that behaves asymptotically as C(t) ~ t~"< ' . As this function diverges for t — > a regu- 
larization is needed. If we take C(t) = (1 + t 2 )~ 7//2 , then the discrete Fourier transform 
S(k) = N lj/2 YujLi ex P(^)C(j) turns out to be negative for some values of k, which is not 
acceptable. To overcome this problem, we introduce the minimal substraction procedure, 
defining a new spectral density as So(k) = S(k) — S m i n (k), being S m i n (k) the minimum 
value of S(k) and using this expression instead of the former one in the filtering step. The 
only effect that the minimal substraction procedure has on the field correlations is that C(0) 
is no longer equal to 1 but adopts the minimal value required to make the previous quadratic 
form positive definite. The modified algorithm is thus the following: 

• Generate a set {uj},j = 1, N , of independent Gaussian variables of zero mean and 
variance one, and compute the discrete Fourier transform of the sequence, 

• Correlations are incorporated in the sequence by multiplying the new set by the desired 
spectral density S(k), having in mind that this density is related with the correlation 
function C(r) through S(k) = iV 1//2 exp(irk)C(r). Make use of S (k) = S(k) — 
Smin(k) (minimal substraction procedure) rather than S(k) in this process. Concretely, 
the correlated sequence in Fourier space Xk is given by Xk = N lj/2 So(k) 1 ^ 2 Uk- 

• Calculate the inverse Fourier transform of X}. to obtain the Gaussian field Xj with the 
desired correlations. 



A. Application to real cardiac interbeat dynamics 

As a further example, we use the dynamics of healthy sinus rhythm cardiac interbeats, a 
physiological stochastic process that has been shown to evidence long-range correlations |7| ■ 
In figure 0] we have plotted the degree distribution of the HVG generated by a time series 
of the beat-to-beat fluctuations of five young subjects (21-34 yr) with healthy sinus rhythm 



heartbeat 



30]. Even if these time series are short (about 6000 data), the results match 



those obtained in the previous examples, namely, that the associated graph is characterized 
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FIG. 3: From left to right, up to bottom: Semilog plot of the degree distributions of horizontal 
visibility graphs associated to long-range correlated series with correlation function C(t) ~ t~ 7 , for 
different values of 7 (data are averaged over 100 realizations). In every case we find that the degree 
distribution is exponential P(k) ~ exp(— Xk), where the slope A monotonically decreases with 7. In 
figure 9 and 10 we plot the slope of such degree distribution for increasing values of the correlation 
strength 7: the convergence towards the uncorrelated situation (A = \ un = ln(3/2)) is slow, what 
allows us to distinguish correlated series from uncorrelated ones even when the correlations are 
very weak. 
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by an exponential degree distribution with slope A > X uri} as it corresponds to a correlated 
stochastic process. 

All these examples provide evidence showing that a time series of stochastic correlated 
data can be characterized by its associated HVG. This graph has an exponential node-degree 
distribution with a characteristic parameter A that always exceeds the uncorrelated value 
\ un = ln(2/3). This is true even in the case of weakly correlated processes (large values of 
the correlation exponent 7 in the case of power-law, long-range, decay of correlations, or 
small values of r in the case of an exponential, short-range, decay). 




1000 2000 3000 4000 5000 6000 7000 
t 



0.1 



0.01 



0.001 



Heartbeat series 
X=0.5 
uncorrelated 



6 8 10 12 14 

k 



FIG. 4: Semi-log plot of the degree distribution of the HVG associated to series of healthy subjects 



301 ] . These are a prototypical example of a long-range 



interbeat electrocardiogram of 6000 data 

correlated stochastic process 7j. The straight line characterizes the theoretical result for an un- 
correlated process. The degree distribution is exponential with A = 0.5 > X un , corresponding to 
a correlated stochastic process, as predicted by our theory. Results correspond to an average over 
five time series, one of them being depicted in the left panel. 



IV. CHAOTIC MAPS 

We now focus on processes generated by chaotic maps. In a preceding work [16], we 
conjectured that the Poincare recurrence theorem suggests that the degree distribution of 
HVGs associated to chaotic series should be asymptotically exponential. Here we address 
several deterministic time series generated by chaotic maps, and analyze the possible 
deviations from the uncorrelated results. Concretely, we tackle the following maps: 
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FIG. 5: From left to right, up to bottom: Semilog plot of the degree distributions of Horizontal 
visibility graphs associated to series generated through chaotic maps with different correlation di- 
mension (data are averaged over 100 realizations). In every case we find that the degree distribution 
is exponential P(k) ~ exp(— Xk), where the slope A monotonically increases with the correlation 
dimension D. In the bottom right we plot the functional relation between A and D, showing that 
the values of A converge towards the uncorrelated situation (A = X un = ln(3/2)) for increasing 
values of the chaos dimensionality. 
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(1) the a-map f(x) = 1 — \2x — 1| Q , that reduces to the logistic and tent maps in their fully 
chaotic region for a = 2 and a = 1 respectively, for different values of a, 

(2) the 2D Henon map (x t+ i = y t + l — ax\, y t+ i = bx t ) in the fully chaotic region (a = 1.4, 
b = 0.3); 

(3) a time-delayed variant of the Henon map: x t +i = bx t ~d + 1 — ax^ in the region (a = 1.6, 
b = 0.1), where it shows chaotic behavior with an attractor dimension that increases linearly 
with the delay d 32)]. This model has also been used for chaos control purposes 33], although 



here we set the parameters a and b to values for which we find high- dimensional chaos for 



almost every initial condition 



32|; 



(4) the Lozi map, a piecewise-linear variant of the Henon map given by x t +\ = 1 + y n — 
a\x t \, yt+i = bx t in the chaotic regime a = 1.7 and b = 0.5; 

(5) the Kaplan- Yorke map x t +\ = 2x t mod (l),y t +i = Xyt + cos(47nr t ) mod (1); and 

(6) the Arnold cat map x t+ ± = x t + y t mod {l),y t +i = x t + 2y t mod (1), a conservative 
system with integer Kaplan- Yorke dimension. References for these maps can be found in 

3i|. 

In figure[5]we plot in semi-log the degree distribution of chaotic series of 2 18 data generated 
through several chaotic maps (logistic, tent, a-map with a = 3 and 4, Henon, delayed Henon 
with a delay d = 10, Lozi, Kaplan- Yorke and Arnold cat). We find that the tails of the 
degree distribution can be well approximated by an exponential function P(k) ~ exp(— Xk). 
Remarkably, we find that A < X un in every case, where A seems to increase monotonically 
as a function of the chaos dimensionality [36] . with an asymptotic value A — > ln(3/2) for 
large values of the attractor dimension (see the right-hand side bottom of the figure where 
we plot the specific values of A as a function of the correlation dimension of the map 34j). 
Again, we deduce that the degree distribution for uncorrelated series is a limiting case of 
the degree distribution for chaotic series but, as opposed to what we found for stochastic 
processes, the convergence flow towards X un is from below, and therefore A = In (3/2) plays 
the role of an effective frontier between correlated stochastic and chaotic processes (see left 
part of Fig. |6] for an illustration) . 

A summary of all data series analyzed can be seen in the right panel of Fig. HI where 
we plot the fitted slope A of particular series generated through power-law correlated (as 
a function of correlation 7) and exponentially correlated (as a function of correlation time 
r) stochastic processes, and through the aforementioned chaotic maps (as a function of 
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FIG. 6: (Left) A diagram: for A < ln(3/2), we have a chaotic process, whereas A > ln(3/2) 
corresponds to a correlated stochastic process. The frontier value A = ln(3/2) corresponds to the 



lfl|. (Right) Plot 



uncorrelated case. Note that this latter value is an exact result of the theory 
of the values of A for several processes, namely: (i) for power-law correlated stochastic series with 
correlation function C(t) = i -7 , as a function of the correlation 7, (ii) for Ornstein-Uhlenbeck 
series with correlation function C(t) = exp(— t/r), as a function of the correlation time r, and (iii) 
for different chaotic maps, as a function of their correlation dimension D. Errors in the estimation 
of A are incorporated in the size of the dots. Notice that stochastic processes cluster in the region 
A > X U n whereas chaotic series belong to the opposite region A < X un , evidencing a convergence 



towards the uncorrelated value \ un 
dimensionality respectively. 



ln(3/2) 



16l ] for decreasing correlations or increasing chaos 



the correlation dimension D). In the following sections we will provide some analytical 
developments and heuristic arguments supporting our findings. 



V. HEURISTICS 

We argue first that correlated series show lower data variability than uncorrelated ones, so 
decreasing the possibility of a node to reach far visibility and hence decreasing (statistically 
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speaking) the probability of appearance of a large degree. Hence, the correlation tends to 
decrease the number of nodes with large degree as compared to the uncorrelated counterpart. 
Indeed, in the limit of infinitely large correlations (7 — > or r — > 00), the variability reduces 
to zero and the series become constant. The degree distribution in this limit case is, trivially, 

P(k) = 5{k - 2) = lim £ exp(-A|A; - 2|), 
A-+00 2 

that is to say, infinitely large correlations would be associated to a diverging value of A. This 
tendency is on agreement with the numerical simulations (right panel of figure ED where we 
show that A monotonically increases with decreasing values of 7 or increasing values of r 
respectively. Having in mind that in the limit of small correlations the theorem previously 
stated implies that A — > X un = ln(3/2), we can therefore conclude that for a correlated 
stochastic process \ s t oc h > A un . 

Concerning chaotic series, remember that they are generated through a deterministic pro- 
cess whose orbit is continuous along the attractor. This continuity introduces a smoothing 
effect in the series that, statistically speaking, increases the probability of a given node to 
have a larger degree (uncorrelated series are rougher and hence it is more likely to have 
more nodes with smaller degree). Now, since in every case we have exponential degree dis- 
tributions (this fact being related with the Poincare recurrence theorem for chaotic series 
and with the return distribution in Poisson processes for stochastic series [li]]), we con- 
clude that the deviations must be encoded in the slope A of the exponentials, such that 
Khaos < A Mn < Xstoch, in good agreement with our numerical results. 



VI. ANALYTICAL DEVELOPMENTS 



In 16] we proved that P(k) = (l/3)(2/3) fc_2 for uncorrelated random series. To find out 
a similar closed expression in the case of generic chaotic or stochastic correlated processes 
is a very difficult task, concretely since variables can be long-range correlated and hence 
the probabilities cannot be separated (lack of independence). This leads to a very involved 
calculation which is typically impossible to solve in the general case. However, some an- 
alytical developments can be made in order to compare them with our numerical results. 
Concretely, for Markovian systems global dependence is reduced to a one-step dependence. 
We will make use of such property to derive exact expressions for P(2) and P(3) in some 
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PouV) 



Plog{2) 



PlogW 



1.0 
0.5 
0.1 



0.3012 
0.3211 
0.3333 



0.232 
0.227 
0.222 



0.3333 



0.3332 



TABLE I: Numerical results of P(2) and P(3) associated to (i) an Ornstein-Uhlenbeck series of 
N = 2 18 data with correlation function C(t) = exp(— £/t), for different values of the correlation 
time r, and (ii) to a series of iV = 2 18 data extracted from a logistic map in its fully chaotic region, 
a-map with a = 2. To be compared with exact results derived in section VI. 

Markovian systems (both deterministic and stochastic). In order to compare the theoretical 
calculations of P(2) and P(3) in the case of an Ornstein-Uhlenbeck process (detailed in 
section III) with the numerical results, in table H] we have depicted the associated numerical 
results for different correlation times. 



A. Ornstein-Uhlenbeck process 

Suppose a short-range correlated series (exponentially decaying correlations) of infinite 
size generated through an Ornstein-Uhlenbeck process, and generate its associated HVG. 
Let us consider the probability that a node chosen at random has degree k = 2. This node 
is associated to a datum labelled Xq without lack of generality. Now, this node will have 
degree k = 2 if the datum first neighbors, x\ and x_i have values larger than xq: 

P(k = 2) = P(X-i > Xq n X X > Xq) 

If series data were random and uncorrelated, we would have 

/oo ^oo poo 

dx f{x ) dx-xf(x-x) dxxf(xx) = 1/3, (3) 
-oo J Xq J Xq, 

where we have used the properties of the cumulative probability distribution (note that this 
result holds for any continuous probability density f{x), as shown in jig). Now, in our case 
the variables are correlated, so in general we should have 

/oo poo poo 

dx / dx^i I dxxf(x-x,x ,xx). (4) 
-OO J XQ J Xq 
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We use the Markov property xq, xx) = f (ff-i).f (xn\x^)f(x-[ \xg). that holds for an 

Ornstein-Uhlenbeck process with correlation function C(t) ~ exp(— t/r) [37]: 

exp(-z 2 /2) . , exp(-(x 2 - K Xl ) 2 /2(l - K 2 )) 



/(*) 

where K = exp(— 1/r). 



f(x 2 \xi 



V2vr(l - K 2 ) 



(5) 



Numerical integration allows us to calculate Pot/ (2) f° r every given value of the correlation 
time t. For instance, we find Pot/(2)| r=1 . = 0.3012, P O i/(2)| T =0.5 = 0.3211, P O i/(2)| T=0 .i = 
0.3331, in perfect agreement with our previous numerical results (see tabled]). 
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FIG. 7: Schematic representation of a situation where datum x$ has right-visibility of two data 
(P+(2)), x\ and x%. An arbitrary number of hidden data can be placed between x\ and x 2 , and 
this has to be taken into account in the calculation of P(3). 

An arbitrary datum xq of a series extracted from an Ornstein-Uhlenbeck will have an 
associated node with degree k = 3 with a certain probability Pqu (3) which is the sum of 
the probabilities associated to two possible scenarios, namely (i) the probability that X q has 
two visible data in its right-hand side and a single one in its left-hand side, labeled Pq C/ (3), 
and (ii) the probability that xq has two visible data in its left-hand side and a single one 
in its right-hand side, labeled Pq U {?>). In the very particular case of stationary Markovian 
processes (such as the Ornstein-Uhlenbeck), time invariance yields Pot/ (3) = 2Pq U (3). Let 
us tackle now the calculation of P f [/ (3). Let us quote xx,x 2 the right-hand side visible data 
of Xq and x_i the left-hand side visible one. Formally, we have 



^(3) 



dx / dx- 1 f(x-i)f{x \x-i)P + (2\x ), 



(6) 



where P+(2|xo) is the probability that xq sees two data on its right-hand side (see figure [7J for 
a graphical illustration). Of course in P + (2|xq) we have to take into account the possibility 
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of having an arbitrary number of hidden (non visible) data between the first and the second 
visible datum, so 

/X POO 
dxi / dx 2 f(x 1 \x )f(x 2 \x 1 ) + (7) 
-oo J Xn 



oo J xo 
x pxi 



dxx dzx / dx 2 f(x 1 \x )f(z 1 \x 1 )f(x 2 \z 1 ) + 



-oo J — oo J xo 

XQ PXl PXl POD 



dxi dz x \ dz 2 dx 2 f{xi\x )f{z 1 \xi)f(z 2 \z l )f(x 2 \z 2 ) + ... 

oo J —oo J —oo J XQ 



oo 



= ^I(p\x ) 

p=0 

where f{x\y) is the Ornstein-Uhlenbeck transition probability defined in equation and z p 
is the p-th hidden data located between x\ and x 2 (note that there can be an eventually 
infinite amount of hidden data between x\ and x 2 and these configurations have to be taken 
into account in the calculation). Here J(p|xo) characterizes the probability that xq sees two 
data on its right-hand side with p hidden data between them. 
A little algebra allows us to write 

/XO 
dx 1 f(xi\xo)G p (x 1 ,xi,x ), (8) 
-oo 

where the function G p satisfies a recursive relation: 



G (x,y,z) = / f(h\y)dh, (9) 

J z 

/X 
dhf(h\y)G p -i(x,h,z), p>l. (10) 
-oo 

This is a convolution-like equation that can be formally rewritten as G p = TG p -i, or G p = 
T p Go, with an integral operator T = dhf(h\y). Accordingly, we have 

/xo °° rxo 

dxif(xi\x )'Y]G p (xi,x 1 ,x )= / dx 1 f(x l \x )S(xi,x 1 ,x Q ), (11) 
-oo p=0 J-oo 

where we have defined the summation S(x,y,z) as 

OO OO 

S(x, y } z) = Y J G P {x, y,z) = J2 T p G = Y~f G ^ ^ 

p=0 p=0 

where in the last equality we have used the summation and convergence properties of ge- 
ometric series (Picard sequence). This is valid whenever the spectral radius of the linear 
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operator r(T) < 1, that is, if 

l/n 



lim 

n— >oo 



IX" 1 1 



< 1, (13) 



where ||T|| = max 2/e (_ 00ja ;) J*^ is the norm of T. Now, this condition is trivially 

fulfilled given the fact that f(x\y) is a Markov transition probability. Then equation [T21 can 
be written as (1 — T)S = G , or more concretely 

S(x,y,z) = G (x,y,z) + dhf(h\y)S(x,h,z), (14) 



which is a Volterra equation of the second kind [38] for S(x,y,z). Note that it can also 
be seen as a multidimensional convolution- like equation since the argument in the Markov 
transition probability f(h\y) has the shape h — y', where y' = exp(— l/r)y. Hence / can be 
understood as the kernel of the convolution. 

Typical one-dimensional Volterra integral equations can be numerically solved applying 



quadrature formulae for approximate the integral operator [38|. The technique can be easily 
extended whenever the integral equation involves more than one variable, as it is our case. 
Specifically, a Simpson-type integration scheme leads to a recursion relation with a step S 
to compute the function S(x, y, z). One technical point is that one needs to replace the — oo 
limit in the integral by a suflicienly small number a. We have found that a = —10 is enough 
for a good convergence of the algorithm. Given a value of z the recursion relation 

S(a, a + nd, z) = G (a, a + n8, z) 

fc-i 

S(a + kd, a + n8, z) = Go(a, a + nS, z) + 5 f{a + i5\a + nd)S(a + (k — 1)5, a + i8, z) + 0((d^) 

for k — 0, 1, 2, . . . and n — 0, 1, . . . , k, allows us to compute S(x, y, z) for y < x. 

Summing up, the procedure to compute Pou{$) is the following: calculate S(x\,x\,xq) 
using the previous recursion relation and use this value to obtain P + (2|xo) from a numerical 
integration of the right-hand-side of equation [TT]) (again, the lower limit will be replaced by 
x\ = a). Finally, integrate numerically equation [6] to obtain Pq V (3) from which it readily 
follows Pot/ (3). Applying this methodology for an integration step 5 = 4 x 10~ 3 we find 
Por/(3)| T =i.o = 0.230, Poc/(3)| r=0 .5 = 0.226, or Poc/(3)| r=0 .i = 0.221, in good agreement with 
numerical results (tabled]). 
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B. Logistic map 



A chaotic map of the form x n+ \ = F(x n ) does also have the Markov property, and 
therefore a similar analysis can therefore apply (even if chaotic maps are deterministic). 
For chaotic dynamical systems whose trajectories belong to the attractor, there exists a 
probability measure that characterizes the long-run proportion of time spent by the system 
in the various regions of the attractor. In the case of the logistic map F(x n ) = fix n (l — x n ) 
with parameter /i = 4, the attractor is the whole interval [0, 1] and the probability measure 
f(x) corresponds to the beta distribution with parameters a = 0.5 and b = 0.5: 

f( x \ = y x) . (i6) 

M ; B(0.5,0.5) V ; 

Now, for a deterministic system, the transition probability is 

f(x n+ i\x n ) = 5(x n+ i - F(x n )), (17) 

where S(x) is the Dirac delta distribution. Departing from equation HJ for the logistic map 

F(x n ) = 4x n (l — x n ) and x n G [0, 1], we have 

-P«o 9 (2) = / dx f{x-i)f(x \x- 1 )dx^ 1 I f(x 1 \x )dx 1 = 

J J xq J xo 

1 



dxo / f(x-i)5(xo — F(x-\))dx-i / 5(x\ — F(xo))dxi. (18) 

J xq J xo 

Now, notice that, using the properties of the Dirac delta distribution, J^ 1 8{x\ — F{xo))dx\ 
is equal to one iff F(xq) G [xq, 1], what will happen iff < xq < 3/4, and zero otherwise. 
Therefore the only effect of this integral is to restrict the integration range of xq to be [0, 3/4]. 
On the other hand, 

I f(x^)S(x - F(x_i))dar_i = ^ f(xl)/\F\x* k )\, 

Jxo xJ|F(xJ)=x 

that is, the sum over the roots of the equation F(x) = x$, iff F(x_i) > Xq. But since 
x_i G [xq, 1] in the latter integral, it is easy to see that again, this is verified iff < x Q < 3/4 
(as a matter of fact, if < xq < 3/4 there is always a single value of X-i G [xq, 1] such that 
F(x-i) = xq, so the sum restricts to the adequate root). It is easy to see that the particular 
value is x* — (1 + y/1 — x )/2. Making use of these piecewise solutions and equation [TH we 
finally have 

f 3/4 f(x*) 

P log (2) = / J L— dx = 1/3, (19) 

J Ay/1 - X 
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which is in perfect agreement with the numerical results (see tabled]). Note that a similar 
development can be fruitfully applied to other chaotic maps, provided that they have a well 
defined natural measure. 

The approach for analytically calculating P£ g (3) in the case of a chaotic map with a well 
defined natural measure -such as the logistic map in its fully chaotic region \l = 4.0- is very 
similar to the one adopted for an Ornstein-Uhlenbeck process, again replacing the proba- 
bility density and Markovian transition probability with equations [16] and [TT] Remarkably, 
applying the properties of the Dirac delta and the logistic map it can be easily proved that 
1(0) = 1 and I(p) = Vp > provided that x is restricted to the range 3/4 < x < 1. The 
whole calculation therefore reduces to 

P 4(3) = f 77T=^o = 1/6, (20) 

.73/4 4V1 - Xq 

that yields Pi og (3) = 2P£ (3) = 1/3, in perfect agreement with numerical results (see table 
H]). Again, similar developments can be straightforwardly applied to other chaotic maps with 
well defined natural measure. 



VII. COMMENT ON NOISY PERIODIC MAPS 

Periodic series have an associated HVG with a degree distribution formed by a finite 
number of peaks, these peaks being related to the series period, what is reminiscent of 



the discrete Fourier spectrum of a periodic series [15|, ]16| . The reason is straightforward: a 
periodic series maps into an HVG which, by construction, is a repetition of a root motif. Now, 
if we superpose a small amount of noise to a periodic series (a so-called extrinsic noise), while 
the degree of the nodes with associated small values will remain rather similar, the nodes 
associated to higher values will eventually increase their visibility and hence reach larger 
degrees. Accordingly, the delta-like structure of the degree distribution will be perturbed, 
and an exponential tail will arise due to the presence of such noise. Can the algorithm 
characterize such kind of series? The answer is positive, since the degree distribution can be 
analytically calculated as it follows: Consider for simplicity a period-2 time series polluted 
with white noise (see the left part of figure E]for a graphical illustration). The HVG is formed 
by two kind of nodes: those associated to high data with values ((11,13,15, ...) in the figure) 
and those associated to data with small values ((z2,*4, i%, ■■■))■ These latter nodes will have, 
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FIG. 8: Left: Periodic series of 2 20 data generated through the logistic map ;c n +i = fJ-x n (l — x n ) 
for \i = 3.2 (where the map shows periodic behavior with period 2) polluted with extrinsic white 
gaussian noise extracted from a Gaussian distribution N(0, 0.05). Right: Dots represent the degree 
distribution of the associated HVG, whereas the straight line is equation [21] (the plot is in semi- 
log). Note that P(2) = 1/2, also as theory predicts, and that P(3) is not exactly zero due to 
boundary effects in the time series. The algorithm efficiently detects both signals and therefore 
easily distinguishes extrinsic noise. 

by construction, degree k = 2. On the other hand, the subgraph formed by the odd nodes 
(ii, {3, is, ...) will essentially reduce to the one associated to an uncorrelated series, i.e. its 
degree distribution will follow equation [2J Now, considering the whole graph, the resulting 
degree distribution will be such that 

P(2) = 1/2, 
P(3) = 0, 
P(k + 2) 

^P(k) = 



1 (2 



3 V3 
1 (2 



k-2 



k > 2, 



4 \3 



fe-3 



k > 4, 



(21) 



that is to say, introducing a small amount of extrinsic uncorrelated noise in a periodic 
signal introduces an exponential tail in the HVG's degree distribution with slope ln(3/2). 
In the left part of figure we plot in semi-log the degree distribution of a periodic-2 series 



20 



of 2 20 data polluted with an extrinsic white Gaussian noise extracted from a Gaussian 
distribution iV(0,0.05). Numerical results confirm the validity of equation [2T1 Note that 
this methodology can be extended to every integrable deterministic system, and therefore 
we conclude that extrinsic noise in a mixed time series is well captured by the algorithm. 
Conversely, introducing a small amount of intrinsic noise in a periodic series is more tricky. 
For instance, consider the noisy logistic map defined as 

x t+ i = /ix t (l - x t ) + ait-, 

where £t are independent random numbers extracted from a Gaussian distribution N(0, a) 
with zero mean and standard deviationa. For some values of fi < /ioo (that is, in the periodic 
regime of the associated noise-free logistic map), small amounts of intrinsic noise can produce 
orbits very similar to those generated by the noise-free version of the map in the chaotic 



regime [14]], in the sense that, superposed to the delta-like shape of P{k), an asymptotic 
exponential tail with A < \ un may eventually develop. Besides the delta-like structure of 
P{k) appearing for short values of k (reminiscent of the periodicity of the noise-free map), the 
algorithm fails in determining the source of entropy of the system (which is stochastic here, 
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where chaos and noise are 



and therefore A > A un ). This is a typical pathological case 
difficult to distinguish. Indeed, it has been pointed out that rather sophisticated methods 
such as finite size Lyapunov exponents or (e, r)— entropies have difficulties to determine the 



chaotic/stochastic nature of these maps for finite resolution 39|. This limitation of the 
algorithm should be investigated in detail in further work. 

VIII. CONCLUSION 

To conclude, we have shown that correlated stochastic series map into an horizontal vis- 
ibility graph with an exponential degree distribution with slope A > ln(3/2), that slowly 
tends to its asymptotic value for very weak correlations. Results are confirmed for a real 
physiological time series that has been previously shown to evidence long-range correlations 
3]. Similar results have been obtained for the case of chaotic series, with the peculiar- 
ity that the slope of the degree distribution converges to ln(3/2) in the opposite direction 
(A < ln(3/2)). In a preceding work we analytically proved that for an uncorrelated random 
series, the slope is exactly ln(3/2), independently of the probability density. We therefore 
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conclude that chaotic maps and correlated stochastic processes seem to belong to different 
regions of the A diagram, where A = ln(3/2) plays the role of an effective frontier between 
both processes. It is worth commenting that the horizontal visibility algorithm is very 
fast (as a guide, the generation of the associated graph for a series of N — 2 18 data in 
a standard personal computer takes a computation time of the order of a few seconds). 
Applications include direct characterization of complex signals such as physiological series 
or series extracted from natural phenomena, as a first step where to discriminate amongst 
several modeling framework approaches. Questions for future work include a deeper char- 
acterization of this method, concretely the incorporation of Lyapunov exponents and the 
associated short-term memory effects within the visibility framework, and the study of noisy 
maps, which hitherto constitute a limitation of the algorithm. The characterization of flows 
that produce continuous time series is also an open problem for future research. 
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IX. APPENDIX: STATISTICAL ERROR IN A 

The calculation of A comes straightforwardly from the fitting of the HVG's degree distri- 
bution (concretely, the tail) to an exponential function. Two possible sources of uncertainty 
in this calculation are present, namely: (i) the finite size effects associated to finite time 
series induce a lack of statistics for large values of the graph degree k, and (ii) experimental 
time series are often polluted with measurement errors. While a detailed and systematic 
analysis of these issues is beyond the scope of this work, at this point we can outline the 
following comments: (i) for the stochastic correlated and chaotic systems considered in this 
work, finite size effects seem to be unrelevant for relatively large time series (N > 2 14 ), in 
the sense that the region [k min , k max ] where an exponential function is very well fitted is 
large enough (see figures [3] and [5]) and accordingly, the error associated to A can be simply 
estimated as the error of the exponential fitting, (ii) In general, the procedure to check the 
effect of finite size is the following: consider a stationary series of N data. The error in the 
calculation of A can be estimated by partitioning the original series in s samples of N/ s data, 
labelled s\, Sjv/s- Accordingly, each series generates an HVG whose degree distribution 



22 



can be fitted to an exponential function with slope X Si , such that < A >= ^ Yli=i and 
the associated error is simply the standard deviation from the mean (this is equivalent to 
performing a time average), (iii) In practice, this latter procedure is not appropiate for very 
short time series (N of order O(10 3 )); in this case an ensemble average is better suited (note 
at this point that stationarity is needed in order to guarantee that averaging over ensembles 
and over time yield equivalent results). 

For illustration purposes, we address the case of a power-law correlated stochastic process 
with correlation function C(t) = t -7 , with 7 = 1.5, for which a previous analysis shows that 
the associated HVG has an exponential degree distribution with slope A = 0.54 (see figure 
[3]). For a given time series size N, we generate ten series and plot the degree distribution of 
the associated HVGs in figure [9j The statistical deviations associated to finite size effects 
decrease with N. 
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